fix(bedrock): enable websearch_interception with extended thinking on Bedrock#20489
Quentin-M wants to merge 9 commits into BerriAI:main
Conversation
**Greptile Overview**
**Greptile Summary**
This PR adds two enhancements to the websearch interception handler: extracting API keys from router configuration and handling thinking parameter constraints for Anthropic models.
**Key Changes:**
Critical Issue Found:
Confidence Score: 1/5
| Filename | Overview |
|---|---|
| litellm/integrations/websearch_interception/handler.py | Added API key handling from router config and thinking parameter logic, but critical bug: wrong variable passed to helper functions (module instead of messages list) |
Sequence Diagram
```mermaid
sequenceDiagram
    participant Client
    participant WebSearchHandler
    participant Router
    participant LLM
    participant SearchProvider
    Client->>WebSearchHandler: Request with websearch tool
    WebSearchHandler->>WebSearchHandler: Pre-request hook: convert native tools
    WebSearchHandler->>LLM: Initial request
    LLM-->>WebSearchHandler: Response with tool_use blocks
    WebSearchHandler->>WebSearchHandler: Detect websearch tool_use
    Note over WebSearchHandler,Router: Extract API keys from router config
    WebSearchHandler->>Router: Get search_tools config
    Router-->>WebSearchHandler: search_provider, api_key, api_base
    loop For each search query
        WebSearchHandler->>SearchProvider: Execute search (with api_key)
        SearchProvider-->>WebSearchHandler: Search results
    end
    Note over WebSearchHandler: Check thinking parameter
    alt thinking.budget_tokens > max_tokens
        WebSearchHandler->>WebSearchHandler: Adjust max_tokens = budget_tokens + 4096
    end
    alt Last tool_call message has no thinking_blocks
        WebSearchHandler->>WebSearchHandler: Drop thinking parameter
    end
    WebSearchHandler->>LLM: Follow-up request with search results
    LLM-->>WebSearchHandler: Final response
    WebSearchHandler-->>Client: Return final response
```
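The two `alt` branches in the diagram can be sketched as a single follow-up-request guard. This is a minimal sketch with hypothetical names (`adjust_for_thinking`, `DEFAULT_MAX_TOKENS`), not the actual code in `litellm/integrations/websearch_interception/handler.py`:

```python
from typing import Any, Dict, List

DEFAULT_MAX_TOKENS = 4096  # headroom added on top of the thinking budget


def adjust_for_thinking(
    kwargs: Dict[str, Any], messages: List[Dict[str, Any]]
) -> Dict[str, Any]:
    """Apply the two guards from the diagram before the follow-up request.

    1. Anthropic requires max_tokens > thinking.budget_tokens, so bump
       max_tokens when the budget would meet or exceed it.
    2. If the last assistant message carries no thinking blocks, re-sending
       the `thinking` param would be rejected, so drop it.
    """
    kwargs = dict(kwargs)  # do not mutate the caller's kwargs
    thinking = kwargs.get("thinking") or {}
    budget = thinking.get("budget_tokens")
    if budget is not None and kwargs.get("max_tokens", 0) <= budget:
        kwargs["max_tokens"] = budget + DEFAULT_MAX_TOKENS

    last_assistant = next(
        (m for m in reversed(messages) if m.get("role") == "assistant"), None
    )
    if last_assistant is not None and not last_assistant.get("thinking_blocks"):
        kwargs.pop("thinking", None)
    return kwargs
```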
*Force-pushed from ef0f21a to ff825ed*
Hey @Quentin-M, nice PR — both fixes address real production issues and the code is clean. A couple of things to address before merge:
1. Missing
*Force-pushed from ff825ed to 9e243eb*
*Force-pushed from 9e243eb to a8b16f5*
*Force-pushed from 440e33a to e9c975c*
*Force-pushed from e9c975c to d67ca1d*
*Force-pushed from d67ca1d to c93a203*
@greptile
**Greptile Overview**
**Greptile Summary**
Confidence Score: 2/5
| Filename | Overview |
|---|---|
| litellm/constants.py | Updates BEDROCK_CONVERSE_MODELS; Opus 4.6 entry drops ':0' suffix which breaks matching against other Bedrock IDs. |
| litellm/integrations/websearch_interception/handler.py | Fixes websearch interception loop to preserve kwargs, thinking blocks, and to load search tool credentials from router config. |
| litellm/integrations/websearch_interception/transformation.py | Returns structured TransformRequestResult including tool_calls and thinking blocks; prepends thinking blocks to follow-up assistant message. |
| litellm/litellm_core_utils/core_helpers.py | Extends internal param filtering to handle prefixes and centralizes internal key lists. |
| litellm/llms/anthropic/chat/transformation.py | Adds Opus 4.6 adaptive thinking mapping and drops thinking when last assistant message lacks thinking blocks. |
| litellm/llms/bedrock/beta_headers_config.py | Introduces centralized whitelist/translation for Bedrock anthropic-beta headers with version/family gating. |
| litellm/llms/bedrock/chat/converse_transformation.py | Uses centralized beta filter; strips unsupported context_management body param; adds Opus 4.6 adaptive thinking path. |
| litellm/llms/bedrock/messages/invoke_transformations/anthropic_claude3_transformation.py | Filters/translates beta headers via centralized filter and strips context_management from body for Invoke Messages. |
| litellm/router.py | Stores resolved custom_llm_provider into deployment params to make provider visible to callbacks post-alias resolution. |
| litellm/utils.py | Adds helper to detect missing thinking blocks in last assistant message; minor BedrockModelInfo import refactor. |
Sequence Diagram
```mermaid
sequenceDiagram
    participant Client
    participant Router as litellm.Router
    participant WSI as WebSearchInterceptionLogger
    participant Provider as Bedrock/Anthropic
    participant Search as litellm.asearch
    Client->>Router: completion(model alias, messages, tools, thinking)
    Router->>WSI: pre_api_call(kwargs incl. resolved custom_llm_provider)
    WSI->>WSI: convert hosted web_search tool -> regular tool
    WSI-->>Router: return {**kwargs, tools: converted}
    Router->>Provider: initial LLM request
    Provider-->>Router: response(content blocks)
    Router->>WSI: async_post_call_success(response, kwargs)
    WSI->>WSI: transform_request() extracts tool_use + thinking blocks
    alt has websearch tool_use
        WSI->>Search: asearch(query, provider, api_key/api_base from router.search_tools)
        Search-->>WSI: search result(s)
        WSI->>WSI: transform_response() builds assistant msg (thinking + tool_use) + user tool_result
        WSI->>Provider: follow-up LLM request(max_tokens adjusted if <= thinking budget)
        Provider-->>WSI: final response
    else no websearch
        WSI-->>Router: passthrough
    end
```
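The `transform_request()` extraction step above might look like the following. This is illustrative only — `extract_websearch_calls` and `WEBSEARCH_TOOL_NAMES` are hypothetical names, not the actual implementation in `transformation.py`:

```python
from typing import Any, Dict, List, Tuple

# Assumed tool names for illustration; the real handler matches the tool it
# registered during pre_api_call conversion.
WEBSEARCH_TOOL_NAMES = {"web_search", "websearch"}


def extract_websearch_calls(
    content_blocks: List[Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Split an Anthropic-style content array into websearch tool_use blocks
    and thinking blocks, mirroring the extraction step in the diagram."""
    tool_calls = [
        b for b in content_blocks
        if b.get("type") == "tool_use" and b.get("name") in WEBSEARCH_TOOL_NAMES
    ]
    thinking_blocks = [
        b for b in content_blocks
        if b.get("type") in ("thinking", "redacted_thinking")
    ]
    return tool_calls, thinking_blocks
```

The agentic loop only runs when `tool_calls` is non-empty; `thinking_blocks` is carried forward so the follow-up assistant message can satisfy Anthropic's block-ordering rule.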
**Review Findings**
Thanks for the comprehensive work on enabling websearch + extended thinking on Bedrock!
**CI Status**
**Must Fix**
1. **Lint failure: function too long** — The
2. **Multiple test failures in beta headers code** — Several new tests are failing:
The beta header filtering logic appears to have bugs in version/model matching that need investigation.
3. **Missing**
*Force-pushed from c93a203 to 18dc198*
*Force-pushed from 18dc198 to ab743af*
**Greptile Summary**
This PR implements three main feature areas: (1) websearch interception with extended thinking support on Bedrock, (2) centralized beta header filtering for all Bedrock APIs, and (3) improved thinking param handling for assistant messages without thinking blocks.
**Key concerns:**
What works well:
Confidence Score: 2/5
| Filename | Overview |
|---|---|
| litellm/constants.py | Removes :0 suffix from Claude Opus 4.6 model ID in BEDROCK_CONVERSE_MODELS. Minor change, already discussed in prior review thread. |
| litellm/integrations/websearch_interception/handler.py | Major changes: thinking block preservation through websearch agentic loop, API key/base loading from router search_tools config, max_tokens auto-adjustment for thinking budget. The max_tokens adjustment only checks type == "enabled" but not "adaptive" (minor inconsistency). |
| litellm/integrations/websearch_interception/transformation.py | Refactored to use NamedTuple return types (TransformedRequest, TransformedResponse). Added thinking block capture and preservation. Clean, well-tested changes. |
| litellm/litellm_core_utils/core_helpers.py | Extracted internal params to module-level constants (INTERNAL_PARAMS, INTERNAL_PARAMS_PREFIXES) with prefix-based filtering. Added _is_param_internal() helper. Mostly formatting changes alongside the refactor. |
| litellm/llms/anthropic/chat/transformation.py | Adds last_assistant_message_has_no_thinking_blocks check alongside existing tool_calls check to drop thinking param when assistant messages have text but no thinking blocks. |
| litellm/llms/base_llm/chat/transformation.py | Adds "adaptive" thinking type recognition in is_thinking_enabled(), supporting Opus 4.6. Small, targeted change. |
| litellm/llms/bedrock/beta_headers_config.py | New centralized module for Bedrock beta header filtering with version-based model support, family restrictions, and header translations. Well-documented and extensible design. |
| litellm/llms/bedrock/chat/converse_transformation.py | CRITICAL: Removes native structured outputs support (_supports_native_structured_outputs, _create_output_config_for_response_format, _add_additional_properties_to_schema, outputConfig handling) and renames _is_nova_2_model to _is_nova_lite_2_model (excluding Nova-2-Pro). These removals break existing tests and remove production features. |
| litellm/llms/bedrock/messages/invoke_transformations/anthropic_claude3_transformation.py | Integrates centralized beta header filter with translation support, strips context_management, removes Sonnet 4.6 patterns from interleaved thinking support. Simplifies tool search beta header logic. |
| litellm/router.py | Stores custom_llm_provider in deployment's litellm_params after alias resolution, enabling callbacks (websearch_interception) to access the resolved provider. |
| litellm/utils.py | Adds _message_has_thinking_blocks() helper and last_assistant_message_has_no_thinking_blocks() function. Refactors existing any_assistant_message_has_thinking_blocks to use shared helper. Well-tested changes. |
Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[User Request with web_search tool] --> B{WebSearchInterceptionLogger\nasync_log_pre_api_call}
    B -->|Provider not enabled| C[Pass through to LLM]
    B -->|Provider enabled| D[Convert web_search to\nLiteLLM standard tool]
    D --> E[LLM Response]
    E --> F{async_should_run_agentic_loop}
    F -->|No WebSearch tool_use| G[Return response]
    F -->|WebSearch tool_use detected| H[Extract tool_calls +\nthinking_blocks]
    H --> I[_execute_agentic_loop]
    I --> J[Load search credentials\nfrom router search_tools]
    J --> K[Execute parallel searches\nvia litellm.asearch]
    K --> L[Build follow-up messages\nwith thinking + tool_result]
    L --> M{max_tokens <= budget_tokens?}
    M -->|Yes| N[Adjust max_tokens =\nbudget + DEFAULT_MAX_TOKENS]
    M -->|No| O[Keep original max_tokens]
    N --> P[anthropic_messages.acreate\nfollow-up request]
    O --> P
    P --> Q[Return final response]
    subgraph Beta Header Filtering
        R[anthropic-beta headers] --> S{BedrockBetaHeaderFilter}
        S --> T[Whitelist check]
        T --> U[Version-based filtering]
        U --> V[Family restrictions]
        V --> W[Header translation\ne.g. advanced-tool-use]
        W --> X[Filtered headers to AWS]
    end
```
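The Beta Header Filtering subgraph can be sketched as a whitelist with optional version/family gating and translation. All header values and rules below are illustrative assumptions, not the real whitelist in `beta_headers_config.py`:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class BetaHeaderRule:
    """One whitelisted anthropic-beta value with optional gating."""
    min_version: Optional[float] = None   # e.g. 4.5 for "requires 4.5+"
    families: Optional[Set[str]] = None   # e.g. {"opus", "sonnet"}
    translate_to: Optional[str] = None    # rewritten value for AWS


# Illustrative rules only (header strings and gates are assumptions).
RULES: Dict[str, BetaHeaderRule] = {
    "interleaved-thinking-2025-05-14": BetaHeaderRule(min_version=4.0),
    "context-management-2025-06-27": BetaHeaderRule(),
    "advanced-tool-use-beta": BetaHeaderRule(translate_to="tool-use-beta"),
}


def filter_beta_headers(
    values: List[str], model_family: str, model_version: float
) -> List[str]:
    """Whitelist check -> version gate -> family gate -> translation."""
    out: List[str] = []
    for value in values:
        rule = RULES.get(value)
        if rule is None:
            continue  # not whitelisted: never reaches AWS
        if rule.min_version is not None and model_version < rule.min_version:
            continue
        if rule.families is not None and model_family not in rule.families:
            continue
        out.append(rule.translate_to or value)
    return out
```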
Last reviewed commit: ab743af
Additional Comments (2)
The patterns for
Similarly to the interleaved thinking removal above, Sonnet 4.6 patterns were removed from
However, the centralized
*Force-pushed from 32f4311 to 029bc83*
When Anthropic's extended thinking is enabled, assistant messages must start with thinking blocks before tool_use blocks. The agentic loop was creating follow-up messages with only tool_use blocks, causing validation errors. This change ensures thinking blocks from the original response are preserved and included at the start of follow-up assistant messages.
- Created `TransformRequestResult` NamedTuple to capture both tool_calls and thinking_blocks from `transform_request()`, making the contract explicit and extensible
- Modified `transform_request()` to extract and return thinking/redacted_thinking blocks alongside tool calls
- Updated `transform_response()` to accept thinking_blocks and prepend them to follow-up assistant messages
- Passed thinking_blocks through the agentic loop chain: detection → execution → message transformation
- Fixed `transform_request()` to return full kwargs (not just tools) to preserve other request parameters
- Used `filter_internal_params()` utility instead of manual filtering for consistency
This change fixes websearch interception when extended thinking mode is enabled.
**Problem**: When Anthropic's extended thinking is enabled, assistant messages must start with thinking blocks before tool_use blocks. The agentic loop was creating follow-up messages with only tool_use blocks, causing the error: `messages.1.content.0.type: Expected 'thinking' or 'redacted_thinking', but found 'tool_use'`
**Solution**: Modified `transform_request()` to capture thinking/redacted_thinking blocks from the original response, and `transform_response()` to include them at the start of the assistant message in follow-up requests.
**Testing**: Successfully tested end-to-end with Claude Code → LiteLLM Proxy → AWS Bedrock → Claude Opus 4.5.
```yaml
model_list:
  - model_name: claude-opus-4-5-20251101
    litellm_params:
      model: bedrock/us.anthropic.claude-opus-4-5-20251101-v1:0
      aws_region_name: us-west-2
    model_info:
      supports_web_search: true

litellm_settings:
  callbacks: ["websearch_interception"]
  websearch_interception_params:
    enabled_providers: ["bedrock"]
    search_tool_name: "searxng-search"

search_tools:
  - search_tool_name: searxng-search
    litellm_params:
      search_provider: searxng
      api_base: "https://searxng.example.com"
```
**Note**: Uses `bedrock/` (not `bedrock/converse/`) to route through `anthropic_messages_handler()` which supports agentic hooks.
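The thinking-block-ordering fix described above amounts to prepending the captured blocks when the follow-up assistant message is assembled. A minimal sketch (the function name is hypothetical; the real logic lives in `transform_response()`):

```python
from typing import Any, Dict, List


def build_followup_assistant_message(
    thinking_blocks: List[Dict[str, Any]],
    tool_use_blocks: List[Dict[str, Any]],
) -> Dict[str, Any]:
    """With extended thinking enabled, Anthropic requires an assistant
    message's content to start with thinking/redacted_thinking blocks
    before any tool_use block. Prepending the blocks captured from the
    original response avoids:
      messages.1.content.0.type: Expected 'thinking' or
      'redacted_thinking', but found 'tool_use'
    """
    return {"role": "assistant", "content": [*thinking_blocks, *tool_use_blocks]}
```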
Fixes issue where websearch interception failed with "TAVILY_API_KEY is not set" error when using search providers that require API keys.

Changes:
- Extract api_key and api_base from router search_tools configuration
- Pass credentials to litellm.asearch() when available
- Fall back to environment variables when credentials are not in config
- Maintain backward compatibility with existing configurations

Root cause: Handler was only extracting search_provider from router config, but not the associated api_key and api_base fields. This caused litellm.asearch() to fall back to environment variables, which failed when keys weren't set in env.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
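The credential lookup described in this commit might be sketched as follows (hypothetical helper name; the real extraction is inside the handler's agentic loop):

```python
from typing import Any, Dict, List, Optional, Tuple


def resolve_search_credentials(
    search_tools: List[Dict[str, Any]], search_tool_name: str
) -> Tuple[Optional[str], Optional[str], Optional[str]]:
    """Look up (search_provider, api_key, api_base) for the configured tool.

    Returning None for api_key/api_base lets litellm.asearch() fall back
    to environment variables, preserving backward compatibility.
    """
    for tool in search_tools:
        if tool.get("search_tool_name") != search_tool_name:
            continue
        params = tool.get("litellm_params", {})
        return (
            params.get("search_provider"),
            params.get("api_key"),
            params.get("api_base"),
        )
    return None, None, None
```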
Fixes websearch interception failures when thinking.budget_tokens is set and requests violate Anthropic's requirement: max_tokens > budget_tokens.

Changes:
- Validate max_tokens against thinking.budget_tokens when extended thinking is enabled
- Automatically adjust max_tokens to budget_tokens + DEFAULT_MAX_TOKENS (4096) when insufficient
- Follows the same pattern as base transformation classes in LiteLLM

This prevents the error "max_tokens must be greater than thinking.budget_tokens" when using extended thinking with websearch interception.

Related issue: BerriAI#14194

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
*Force-pushed from 029bc83 to 4365ae0*
…pport

Standardize anthropic-beta header handling across all Bedrock APIs (Invoke Chat, Converse, Messages) using a centralized whitelist-based filter with version-based model support.

Problems:
- Inconsistent filtering: Invoke Chat used a whitelist (safe), Converse/Messages used blacklists (allow unsupported headers through)
- Production risk: unsupported headers could cause AWS API errors
- Maintenance burden: adding new Claude models required updating multiple hardcoded lists

Approach:
- Centralized BedrockBetaHeaderFilter with whitelist approach
- Version-based filtering (e.g., "requires 4.5+") instead of model lists
- Family restrictions (opus/sonnet/haiku) when needed
- Automatic header translation for backward compatibility

Changes:
- Add `litellm/llms/bedrock/beta_headers_config.py`: BedrockBetaHeaderFilter class, whitelist of 11 supported beta headers, version/family restriction logic, debug logging support
- Invoke Chat: replace local whitelist with centralized filter
- Converse: remove blacklist (30 lines), use whitelist filter
- Messages: remove complex filter (55 lines), preserve translation
- Add `tests/test_litellm/llms/bedrock/test_beta_headers_config.py`: 40+ unit tests for filter logic
- Extend `tests/test_litellm/llms/bedrock/test_anthropic_beta_support.py`: 13 integration tests for API transformations; verify filtering, version restrictions, translations
- Add `litellm/llms/bedrock/README.md`: maintenance guide for adding new headers/models; enhanced module docstrings with examples

Result:
- Production safety: only whitelisted headers reach AWS
- Zero maintenance for new Claude models (Opus 5, Sonnet 5, etc.)
- Consistent filtering across all 3 APIs
- Preserved backward compatibility (advanced-tool-use translation)

```bash
poetry run pytest tests/test_litellm/llms/bedrock/ -v
```

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…ock APIs

Bedrock doesn't support context_management as a request body parameter. The feature is enabled via the anthropic-beta header (context-management-2025-06-27), which was already handled correctly. Leaving context_management in the body causes: "context_management: Extra inputs are not permitted"

Strip the parameter from all 3 Bedrock API paths:
- Invoke Messages API
- Invoke Chat API
- Converse API (additionalModelRequestFields)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
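The strip step this commit describes is a one-field removal applied on each Bedrock request path. A minimal sketch (hypothetical function name; the actual code lives in the three transformation modules):

```python
from typing import Any, Dict


def strip_context_management(request_body: Dict[str, Any]) -> Dict[str, Any]:
    """Bedrock rejects context_management in the request body
    ("context_management: Extra inputs are not permitted"). The feature is
    driven by the anthropic-beta header instead, so drop the body field
    before sending the request to AWS."""
    body = dict(request_body)  # copy so the caller's dict is untouched
    body.pop("context_management", None)
    return body
```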
…without thinking blocks

Follow-up to a494503f4b, which fixed thinking + tool_use. That fix only detected missing thinking blocks on assistant messages with tool_calls. When the last assistant message has plain text content (no tool_calls), the check returned False and thinking was not dropped, causing: "Expected thinking or redacted_thinking, but found text"

Add last_assistant_message_has_no_thinking_blocks() to detect any assistant message with content but no thinking blocks. Extract a shared _message_has_thinking_blocks() helper that checks both the thinking_blocks field and the content array for thinking/redacted_thinking blocks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
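A sketch of the two helpers this commit describes. The names follow the commit message, but the bodies are assumptions, not the actual `litellm/utils.py` code:

```python
from typing import Any, Dict, List


def _message_has_thinking_blocks(message: Dict[str, Any]) -> bool:
    """True if the message carries thinking either via the thinking_blocks
    field or as thinking/redacted_thinking entries in a content array."""
    if message.get("thinking_blocks"):
        return True
    content = message.get("content")
    if isinstance(content, list):
        return any(
            block.get("type") in ("thinking", "redacted_thinking")
            for block in content
        )
    return False


def last_assistant_message_has_no_thinking_blocks(
    messages: List[Dict[str, Any]],
) -> bool:
    """True when the most recent assistant message has content (text or
    tool_use alike) but no thinking blocks, signalling that the thinking
    param should be dropped on the next request."""
    for message in reversed(messages):
        if message.get("role") != "assistant":
            continue
        return bool(message.get("content")) and not _message_has_thinking_blocks(message)
    return False
```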
Upstream only checks for type="enabled" but Opus 4.6 uses type="adaptive". Without this fix, max_tokens auto-adjustment doesn't trigger for adaptive thinking, causing API errors.
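The fix amounts to recognizing both thinking types in the enablement check. A minimal sketch under the assumption that the upstream check compares `thinking["type"]` against a literal (the real method lives in `litellm/llms/base_llm/chat/transformation.py`):

```python
from typing import Any, Dict, Optional


def is_thinking_enabled(thinking: Optional[Dict[str, Any]]) -> bool:
    """Treat both type="enabled" (classic extended thinking) and
    type="adaptive" (Opus 4.6) as thinking-enabled, so downstream guards
    such as the max_tokens auto-adjustment fire for both."""
    return bool(thinking) and thinking.get("type") in ("enabled", "adaptive")
```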
*Force-pushed from 4365ae0 to 2350657*
fix bedrock pii redaction null value handling
@ryangoldblatt-bm to sign the CLAs 🙏
Done sir, shall we reopen this?
Summary

Rebased on BerriAI/litellm `main` (Feb 18, 2026) with the following fixes on top of cherry-picked PR #20488:

**Websearch Interception**
- Load `api_key`/`api_base` from router's `search_tools` config (fixes "TAVILY_API_KEY is not set")
- Adjust `max_tokens` when `<= thinking.budget_tokens` (Anthropic requires `max_tokens > budget_tokens`)

**Bedrock**
- Strip `context_management` from request body for all Bedrock APIs (Invoke Messages, Invoke Chat, Converse)

**Thinking**
- Recognize the `adaptive` thinking type in `is_thinking_enabled` (Opus 4.6)

Test Plan
🤖 Generated with Claude Code